Lightweight Lempel-Ziv Parsing

Authors

  • Juha Kärkkäinen
  • Dominik Kempa
  • Simon J. Puglisi
Abstract

We introduce a new approach to LZ77 factorization that uses O(n/d) words of working space and O(dn) time for any d ≥ 1 (for polylogarithmic alphabet sizes). We also describe carefully engineered implementations of alternative approaches to lightweight LZ77 factorization. Extensive experiments show that the new algorithm is superior in most cases, particularly at the lowest memory levels and for highly repetitive data. As a part of the algorithm, we describe new methods for computing matching statistics which may be of independent interest.
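For readers unfamiliar with the problem, the following is a minimal sketch of what an LZ77 factorization is. It is a naive quadratic-time baseline that keeps the whole text in memory, not the lightweight algorithm the abstract describes. The function names `lz77_factorize` and `lz77_decode` and the `(position, length)` / `(character, 0)` output convention are assumptions made for this illustration.

```python
def lz77_factorize(text):
    """Naive LZ77 factorization (illustrative only, not the paper's method).

    Each factor is either (character, 0) for a symbol with no earlier
    occurrence, or (position, length) for the longest match that starts
    at an earlier position. Matches may overlap the current position.
    """
    factors = []
    i, n = 0, len(text)
    while i < n:
        best_len, best_pos = 0, -1
        # Try every earlier starting position j < i.
        for j in range(i):
            l = 0
            while i + l < n and text[j + l] == text[i + l]:
                l += 1
            if l > best_len:
                best_len, best_pos = l, j
        if best_len == 0:
            factors.append((text[i], 0))   # literal symbol
            i += 1
        else:
            factors.append((best_pos, best_len))
            i += best_len
    return factors


def lz77_decode(factors):
    """Rebuild the text from the factors produced above."""
    out = []
    for first, length in factors:
        if length == 0:
            out.append(first)
        else:
            # Copy symbol by symbol so self-overlapping factors work.
            for k in range(length):
                out.append(out[first + k])
    return "".join(out)


factors = lz77_factorize("abababb")
print(factors)  # [('a', 0), ('b', 0), (0, 4), (1, 1)]
```

The third factor `(0, 4)` copies four symbols starting at position 0 even though only two symbols have been produced so far, which is why the decoder copies one symbol at a time. Highly repetitive inputs yield few, long factors, which is the regime the abstract singles out.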


Similar resources

Faster Lightweight Lempel-Ziv Parsing

We present an algorithm that computes the Lempel-Ziv decomposition in O(n(log σ + log log n)) time and n log σ + εn bits of space, where ε is a constant rational parameter, n is the length of the input string, and σ is the alphabet size. The n log σ bits in the space bound are for the input string itself, which is treated as read-only.


On Tinhofer's Linear Programming Approach to Isomorphism Testing

On the complexity of master problems Emergence on decreasing sandpile models 14:35 Kosolobov Durand, Romashchenko Faster lightweight Lempel-Ziv parsing Quasiperiodicity and non-computability in tilings On the Complexity of Noncommutative Polynomial Factorization


Lempel-Ziv Dimension for Lempel-Ziv Compression

This paper describes the Lempel-Ziv dimension (a Hausdorff-like dimension inspired by the LZ78 parsing), its fundamental properties, and its relation to Hausdorff dimension. It is shown that, in the case of individual infinite sequences, the Lempel-Ziv dimension matches the asymptotic Lempel-Ziv compression ratio. This fact is used to describe results on Lempel-Ziv compression in terms of dime...


On Generalized Digital Search Trees with Applications to a Generalized Lempel-Ziv

The goal of this research is twofold: (i) to analyze generalized digital search trees, and (ii) to derive the average profile (i.e., phrase length) of a generalization of the well-known parsing algorithm due to Lempel and Ziv. In the generalized Lempel-Ziv parsing scheme, one partitions a sequence of symbols from a finite alphabet into phrases such that the new phrase is the longest substring seen...


Asymmetry in Ziv/Lempel '78

We compare the number of phrases created by Ziv/Lempel '78 parsing of a binary sequence and of its reversal. We show that the two parsings can vary by a factor that grows at least as fast as the logarithm of the sequence length. We then show that under a suitable condition, the factor can even become polynomial, and argue that the condition may not be necessary.
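The comparison described above can be tried out with a short sketch of the standard LZ78 parsing rule, in which each new phrase is the shortest prefix of the remaining input that has not yet appeared as a phrase. The function name `lz78_phrases` is an assumption for this illustration, and the code is a textbook version of the parsing, not the construction used in the paper.

```python
def lz78_phrases(s):
    """Split s into LZ78 phrases (textbook rule, illustrative only)."""
    seen = set()      # phrases emitted so far
    phrases = []
    cur = ""
    for ch in s:
        cur += ch
        if cur not in seen:
            # cur is a previously seen phrase plus one new symbol.
            seen.add(cur)
            phrases.append(cur)
            cur = ""
    if cur:
        # Trailing piece that repeats an earlier phrase.
        phrases.append(cur)
    return phrases


s = "1011010100010"
forward = lz78_phrases(s)
backward = lz78_phrases(s[::-1])
print(forward)  # ['1', '0', '11', '01', '010', '00', '10']
print(len(forward), len(backward))
```

On short inputs like this one the two counts are typically close; the gap the paper analyzes concerns specially structured sequences and growth with the sequence length.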




Publication date: 2013